
Conversation

@mollux
Contributor

@mollux mollux commented Jun 1, 2025

Related GitHub Issue

Closes: #

Description

The current implementation of the LiteLLM provider doesn't take cached tokens (both cache writes and reads) into account, resulting in incorrect cost calculations.
This PR stores the relevant information (including the Anthropic-only cache_creation_input_tokens field, see https://docs.litellm.ai/docs/completion/prompt_caching) and uses it during cost calculation and when visualising the cached tokens.
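To illustrate the idea, here is a minimal TypeScript sketch of a cache-aware cost calculation. The function and field names echo those mentioned in this PR, but the actual signatures in Roo Code may differ; the pricing fallbacks and the assumption that prompt_tokens already includes cached tokens (per the OpenAI spec) are mine:

```typescript
// Hypothetical model pricing info; prices are USD per million tokens.
interface ModelInfo {
	inputPrice: number
	outputPrice: number
	cacheWritesPrice?: number // Anthropic-only cache writes
	cacheReadsPrice?: number // cached prompt reads
}

// Sketch of an OpenAI-style cost calculation that accounts for cached tokens.
// Assumption: inputTokens (prompt_tokens) already includes the cached tokens,
// so they are subtracted before applying the regular input price.
function calculateApiCostOpenAI(
	info: ModelInfo,
	inputTokens: number,
	outputTokens: number,
	cacheWriteTokens = 0,
	cacheReadTokens = 0,
): number {
	const uncachedInput = inputTokens - cacheWriteTokens - cacheReadTokens
	return (
		(uncachedInput * info.inputPrice +
			outputTokens * info.outputPrice +
			cacheWriteTokens * (info.cacheWritesPrice ?? info.inputPrice) +
			cacheReadTokens * (info.cacheReadsPrice ?? info.inputPrice)) /
		1_000_000
	)
}
```

Without the subtraction and the two cache prices, cached reads would be billed at the full input rate, which is exactly the overcounting this PR fixes.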

Screenshot 2025-06-01 at 21 49 31

Test Procedure

  • Configure a LiteLLM instance.
  • Expose a model that supports prompt caching following the OpenAI spec (e.g. Anthropic models, but not Gemini 2.5).
  • Configure Roo Code to use the LiteLLM model.
  • Validate that the prompt stats contain cache information, and that the cached token counts match the values in the LiteLLM request logs.

Note: depending on the model, LiteLLM's own cost calculation can be wrong (e.g. for Anthropic models), as cache pricing has some quirks there too. This doesn't reduce the usefulness of supporting cache info in Roo Code.

Type of Change

  • 🐛 Bug Fix: Non-breaking change that fixes an issue.
  • New Feature: Non-breaking change that adds functionality.
  • 💥 Breaking Change: Fix or feature that would cause existing functionality to not work as expected.
  • ♻️ Refactor: Code change that neither fixes a bug nor adds a feature.
  • 💅 Style: Changes that do not affect the meaning of the code (white-space, formatting, etc.).
  • 📚 Documentation: Updates to documentation files.
  • ⚙️ Build/CI: Changes to the build process or CI configuration.
  • 🧹 Chore: Other changes that don't modify src or test files.

Pre-Submission Checklist

  • Issue Linked: This PR is linked to an approved GitHub Issue (see "Related GitHub Issue" above).
  • Scope: My changes are focused on the linked issue (one major feature/fix per PR).
  • Self-Review: I have performed a thorough self-review of my code.
  • Code Quality:
    • My code adheres to the project's style guidelines.
    • There are no new linting errors or warnings (npm run lint).
    • All debug code (e.g., console.log) has been removed.
  • Testing:
    • New and/or updated tests have been added to cover my changes.
    • All tests pass locally (npm test).
    • The application builds successfully with my changes.
  • Branch Hygiene: My branch is up-to-date (rebased) with the main branch.
  • Documentation Impact: I have considered if my changes require documentation updates (see "Documentation Updates" section below).
  • Changeset: A changeset has been created using npm run changeset if this PR includes user-facing changes or dependency updates.
  • Contribution Guidelines: I have read and agree to the Contributor Guidelines.

Screenshots / Videos

Documentation Updates

Additional Notes

Get in Touch


Important

Adds cached token cost calculation to LiteLLM, updating model info and usage data reporting.

  • Behavior:
    • Adds cached token cost calculation in LiteLLMHandler in lite-llm.ts.
    • Updates getLiteLLMModels in litellm.ts to include cacheWritesPrice and cacheReadsPrice.
  • Cost Calculation:
    • Updates calculateApiCostOpenAI usage to include cache write and read tokens in lite-llm.ts.
  • Usage Data:
    • Adds cacheWriteTokens and cacheReadTokens to ApiStreamUsageChunk in lite-llm.ts.
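The usage-reporting side of the change can be sketched as follows. The ApiStreamUsageChunk field names come from the bullet points above; the shape of the LiteLLM usage object is an assumption based on https://docs.litellm.ai/docs/completion/prompt_caching, and the actual Roo Code types may differ:

```typescript
// Sketch of the usage chunk extended with cache stats (field names per this PR).
interface ApiStreamUsageChunk {
	type: "usage"
	inputTokens: number
	outputTokens: number
	cacheWriteTokens?: number
	cacheReadTokens?: number
}

// LiteLLM follows the OpenAI spec: cached reads are reported in
// prompt_tokens_details.cached_tokens, while the Anthropic-only cache writes
// arrive in the non-standard cache_creation_input_tokens field.
function usageFromLiteLLM(usage: {
	prompt_tokens: number
	completion_tokens: number
	prompt_tokens_details?: { cached_tokens?: number }
	cache_creation_input_tokens?: number
}): ApiStreamUsageChunk {
	return {
		type: "usage",
		inputTokens: usage.prompt_tokens,
		outputTokens: usage.completion_tokens,
		cacheWriteTokens: usage.cache_creation_input_tokens ?? 0,
		cacheReadTokens: usage.prompt_tokens_details?.cached_tokens ?? 0,
	}
}
```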

This description was created by Ellipsis for 3716053.

@mollux mollux requested review from cte and mrubens as code owners June 1, 2025 19:57
@dosubot dosubot bot added size:XS This PR changes 0-9 lines, ignoring generated files. bug Something isn't working labels Jun 1, 2025
@mollux mollux changed the title Add cached read and writes to cost calculation for LiteLLM Add cached read and writes to stats and cost calculation for LiteLLM provider Jun 1, 2025
Collaborator

@mrubens mrubens left a comment


Thanks!

@dosubot dosubot bot added the lgtm This PR has been approved by a maintainer label Jun 1, 2025
@mrubens
Collaborator

mrubens commented Jun 1, 2025

Let me know if I can help get the tests to pass.

@dosubot dosubot bot added size:S This PR changes 10-29 lines, ignoring generated files. and removed size:XS This PR changes 0-9 lines, ignoring generated files. labels Jun 1, 2025
@mollux
Contributor Author

mollux commented Jun 1, 2025

The compile error should be fixed, but I don't understand the platform unit test failure.
It seems unrelated, but I may be missing something.

@mollux
Contributor Author

mollux commented Jun 1, 2025

The same failure appears on other PRs, e.g. #4206 and #4210, and seems to have been introduced in 5e50c55, so it's probably unrelated to this PR.

@mrubens mrubens merged commit dca1076 into RooCodeInc:main Jun 1, 2025
9 of 11 checks passed
@github-project-automation github-project-automation bot moved this from Triage to Done in Roo Code Roadmap Jun 1, 2025
@github-project-automation github-project-automation bot moved this from New to Done in Roo Code Roadmap Jun 1, 2025
@mrubens
Collaborator

mrubens commented Jun 1, 2025

Ah ok, will try to figure it out separately. Thank you for the PR!

@taylorwilsdon
Contributor

taylorwilsdon commented Jun 1, 2025

I saw the mention of this in #4210 and will have it fixed in that branch if I can save you the time @mrubens (edit - fixed and all passing now)

@hannesrudolph
Collaborator

@mollux can you please shoot me a Discord DM at hrudolph?


